{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# LlamaParse With MongoDB\n",
    "\n",
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_cloud_services/blob/main/examples/parse/demo_mongodb.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\n",
    "In this notebook, we provide a straightforward example of using LlamaParse with MongoDB Atlas VectorSearch.\n",
    "\n",
    "We illustrate the process of using llama-parse to parse a PDF document, then index the document with a MongoDB vector store, and subsequently perform basic queries against this store.\n",
    "\n",
    "This notebook is structured similarly to quick start guides, aiming to introduce users to utilizing llama-parse in conjunction with a MongoDB Atlas VectorSearch.\n",
    "\n",
    "Status:\n",
    "| Last Executed | Version | State      |\n",
    "|---------------|---------|------------|\n",
    "| Aug-19-2025   | 0.6.61  | Maintained |"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Installation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-cloud-services\n",
    "%pip install \"llama-index-vector-stores-mongodb>=0.8.0<0.9.0\" \"llama-index>=0.13.0<0.14.0\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setup API Keys"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\n",
    "    \"LLAMA_CLOUD_API_KEY\"\n",
    "] = \"llx-...\"  # Get it from https://cloud.llamaindex.ai/api-key\n",
    "os.environ[\n",
    "    \"OPENAI_API_KEY\"\n",
    "] = \"sk-...\"  # Get it from https://platform.openai.com/api-keys"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "import pymongo\n",
    "\n",
    "from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch\n",
    "from llama_cloud_services import LlamaParse\n",
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "from llama_index.core import VectorStoreIndex, StorageContext\n",
    "from llama_index.core.node_parser import SentenceSplitter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import Settings\n",
    "from llama_index.llms.openai import OpenAI\n",
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "\n",
    "Settings.llm = OpenAI(model=\"gpt-5-mini\")\n",
    "Settings.embed_model = OpenAIEmbedding(model=\"text-embedding-3-small\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Download Document\n",
    "\n",
    "We will use `Attention is all you need` paper."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Download complete.\n"
     ]
    }
   ],
   "source": [
    "# The URL of the file you want to download\n",
    "url = \"https://arxiv.org/pdf/1706.03762.pdf\"\n",
    "# The local path where you want to save the file\n",
    "file_path = \"./attention.pdf\"\n",
    "\n",
    "# Perform the HTTP request\n",
    "response = requests.get(url)\n",
    "\n",
    "# Check if the request was successful\n",
    "if response.status_code == 200:\n",
    "    # Open the file in binary write mode and save the content\n",
    "    with open(file_path, \"wb\") as file:\n",
    "        file.write(response.content)\n",
    "    print(\"Download complete.\")\n",
    "else:\n",
    "    print(\"Error downloading the file.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Parse the document using `LlamaParse`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Started parsing the file under job_id 993fa45f-f4ed-4d49-9032-794b3470305a\n",
      "."
     ]
    }
   ],
   "source": [
    "result = await LlamaParse(\n",
    "    parse_mode=\"parse_page_with_agent\",\n",
    "    model=\"openai-gpt-4-1-mini\",\n",
    "    high_res_ocr=True,\n",
    "    adaptive_long_table=True,\n",
    "    outlined_table_extraction=True,\n",
    "    output_tables_as_HTML=True,\n",
    ").aparse(file_path)\n",
    "\n",
    "documents = result.get_text_documents(split_by_page=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " sub-layer, which performs multi-head\n",
      "attention over the output of the encoder stack. Similar to the encoder, we employ residual connections\n",
      "around each of the sub-layers, followed by layer normalization. We also modify the self-attention\n",
      "sub-layer in the decoder stack to prevent positions from attending to subsequent positions. This\n",
      "masking, combined with fact that the output embeddings are offset by one position, ensures that the\n",
      "predictions for position i can depend only on the known outputs at positions less than i.\n",
      "\n",
      "3.2  Attention\n",
      "An attention function can be described as mapping a query and a set of key-value pairs to an output,\n",
      "where the query, keys, values, and output are all vectors. The output is computed as a weighted sum\n",
      "\n",
      "                                     3\n",
      "---\n",
      "              Scaled Dot-Product Attention    Multi-Head Attention\n",
      "\n",
      "                                                                          Linear\n",
      "              MatMul\n",
      "\n",
      "  SoftMax                                 \n"
     ]
    }
   ],
   "source": [
    "# Take a quick look at some of the parsed text from the document:\n",
    "print(documents[0].get_content()[10000:11000])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create `MongoDBAtlasVectorSearch`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mongo_uri = \"<mongodb_uri>\"\n",
    "\n",
    "mongodb_client = pymongo.MongoClient(mongo_uri)\n",
    "mongodb_vector_store = MongoDBAtlasVectorSearch(mongodb_client)\n",
    "\n",
    "mongodb_vector_store.create_vector_search_index(\n",
    "    dimensions=1536, path=\"embedding\", similarity=\"cosine\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create nodes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "node_parser = SentenceSplitter()\n",
    "\n",
    "nodes = node_parser.get_nodes_from_documents(documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create Index and Query Engine."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "storage_context = StorageContext.from_defaults(vector_store=mongodb_vector_store)\n",
    "\n",
    "index = VectorStoreIndex(\n",
    "    nodes=nodes,\n",
    "    storage_context=storage_context,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = index.as_query_engine(similarity_top_k=2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Test Query"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "***********New LlamaParse+ Basic Query Engine***********\n",
      "28.4 BLEU\n"
     ]
    }
   ],
   "source": [
    "query = \"What is BLEU score on the WMT 2014 English-to-German translation task?\"\n",
    "\n",
    "response = query_engine.query(query)\n",
    "print(\"\\n***********New LlamaParse+ Basic Query Engine***********\")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "For our big models,(described on the\n",
      "bottom line of table 3), step time was 1.0 seconds. The big models were trained for 300,000 steps\n",
      "(3.5 days).\n",
      "\n",
      "5.3  Optimizer\n",
      "\n",
      "We used the Adam optimizer [20] with β1 = 0.9, β2 = 0.98 and ϵ = 10−9. We varied the learning\n",
      "rate over the course of training, according to the formula:\n",
      "\n",
      "              lrate = d−0.5 · min(step_num−0.5, step_num · warmup_steps−1.5)       (3)\n",
      "              model\n",
      "\n",
      "This corresponds to increasing the learning rate linearly for the first warmup_steps training steps,\n",
      "and decreasing it thereafter proportionally to the inverse square root of the step number. We used\n",
      "warmup_steps = 4000.\n",
      "\n",
      "5.4  Regularization\n",
      "\n",
      "We employ three types of regularization during training:\n",
      "\n",
      "                                        7\n",
      "---\n",
      "Table 2: The Transformer achieves better BLEU scores than previous state-of-the-art models on the\n",
      "English-to-German and English-to-French newstest2014 tests at a fraction of the training cost.\n",
      "\n",
      "     Model                               BLEU            Training Cost (FLOPs)\n",
      "                                         EN-DE EN-FR                   EN-DE  EN-FR\n",
      "     ByteNet [18]               23.75\n",
      "     Deep-Att + PosUnk [39]                     39.2                          1.0 · 1020\n",
      "     GNMT + RL [38]              24.6          39.92     2.3 · 1019           1.4 · 1020\n",
      "     ConvS2S [9]                25.16          40.46     9.6 · 1018           1.5 · 1020\n",
      "     MoE [32]                   26.03          40.56     2.0 · 1019           1.2 · 1020\n",
      "     Deep-Att + PosUnk Ensemble [39]            40.4                          8.0 · 1020\n",
      "     GNMT + RL Ensemble [38]    26.30          41.16     1.8 · 1020           1.1 · 1021\n",
      "     ConvS2S Ensemble [9]       26.36          41.29     7.7 · 1019           1.2 · 1021\n",
      "     Transformer (base model)    27.3           38.1     3.3 · 1018\n",
      "     Transformer (big)           28.4           41.8     2.3 · 1019\n",
      "\n",
      "Residual Dropout      We apply dropout [33] to the output of each sub-layer, before it is added to the\n",
      "sub-layer input and normalized. In addition, we apply dropout to the sums of the embeddings and the\n",
      "positional encodings in both the encoder and decoder stacks. For the base model, we use a rate of\n",
      "Pdrop = 0.1.\n",
      "\n",
      "Label Smoothing             During training, we employed label smoothing of value ϵls = 0.1 [36]. This\n",
      "hurts perplexity, as the model learns to be more unsure, but improves accuracy and BLEU score.\n",
      "\n",
      "6    Results\n",
      "\n",
      "6.1  Machine Translation\n",
      "\n",
      "On the WMT 2014 English-to-German translation task, the big transformer model (Transformer (big)\n",
      "in Table 2) outperforms the best previously reported models (including ensembles) by more than 2.0\n",
      "BLEU, establishing a new state-of-the-art BLEU score of 28.4. The configuration of this model is\n",
      "listed in the bottom line of Table 3. Training took 3.5 days on 8 P100 GPUs. Even our base model\n",
      "surpasses all previously published models and ensembles, at a fraction of the training cost of any of\n",
      "the competitive models.\n",
      "On the WMT 2014 English-to-French translation task, our big model achieves a BLEU score of 41.0,\n",
      "outperforming all of the previously published single models, at less than 1/4 the training cost of the\n",
      "previous state-of-the-art model. The Transformer (big) model trained for English-to-French used\n",
      "dropout rate Pdrop = 0.1, instead of 0.3.\n",
      "For the base models, we used a single model obtained by averaging the last 5 checkpoints, which\n",
      "were written at 10-minute intervals. For the big models, we averaged the last 20 checkpoints. We\n",
      "used beam search with a beam size of 4 and length penalty α = 0.6 [38].\n"
     ]
    }
   ],
   "source": [
    "# Take a look at one of the source nodes from the response\n",
    "print(response.source_nodes[0].get_content())"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
